African Language Technology: The Data-Driven Perspective

Authors

  • Guy De Pauw
  • Gilles-Maurice de Schryver
Abstract

In this paper we outline our recent research efforts, which introduce data-driven methods in the development of language technology components and applications for African languages. Rather than hard-coding the solution to a particular linguistic problem in a set of hand-crafted rules, data-driven methods try to extract the required linguistic classification properties from annotated corpora of the language in question. We describe our efforts to collect and annotate corpora for African languages and show how one can maximise the usability of the (often limited) data with which we are presented. The case studies presented in this paper illustrate the typical advantages of using data-driven methods in the context of natural language processing, namely language independence, development speed, robustness and empiricism.
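To make the contrast between hand-crafted rules and data-driven methods concrete, here is a minimal sketch of a most-frequent-tag baseline that learns word-to-tag mappings from an annotated corpus instead of encoding them by hand. It is not the method used in the paper: the function names (`train_tagger`, `tag`), the tag labels, and the two-sentence Swahili toy corpus are all assumptions invented for illustration.

```python
from collections import Counter, defaultdict


def train_tagger(annotated_sentences):
    """Learn the most frequent tag for each word form from annotated data."""
    counts = defaultdict(Counter)
    for sentence in annotated_sentences:
        for word, tag_label in sentence:
            counts[word.lower()][tag_label] += 1
    # Keep only the single most frequent tag observed for each word form.
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(model, words, default="UNKNOWN"):
    """Tag each word with its learned tag, or `default` if it was never seen."""
    return [(word, model.get(word.lower(), default)) for word in words]


# Hypothetical toy corpus (invented for this sketch, not taken from the paper).
TOY_CORPUS = [
    [("mtoto", "NOUN"), ("anasoma", "VERB"), ("kitabu", "NOUN")],
    [("mwalimu", "NOUN"), ("anaandika", "VERB"), ("barua", "NOUN")],
]

model = train_tagger(TOY_CORPUS)
result = tag(model, ["mwalimu", "anasoma", "barua", "leo"])
print(result)
```

Nothing in the code is specific to one language: retraining on a different annotated corpus yields a tagger for that language, which is the language independence the abstract refers to. Unseen words fall back to a default label rather than causing a failure, a crude stand-in for the robustness it also mentions.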

Publication date: 2009